Predictive Modeling of Weather Station Data:

Linear Regression vs. Graph Neural Network

Colby Fenters & Lilith Holland (Advisor: Dr. Cohen)

August 4, 2025

Introduction

  • Accurate weather prediction is a crucial task with widespread implications across many fields.
  • Traditional forecasting methods often rely on statistical models or physics-based simulations.
  • In this project, we compare the predictive power of a traditional linear regression (LR) model and a graph neural network (GNN) on real-world weather station data (Herzmann 2023).

Methodology

  • Cleaning Process
  • Correlation Analysis
  • Feature Imputation
  • Graph Neural Network
  • Linear Model

Cleaning Process

  • The raw weather station data underwent a multistage cleaning and preprocessing procedure designed to ensure temporal consistency, handle missing values, and prepare the data for both linear and GNN-based models.
  • The raw dataset has shape (time, feature): each of the 8 nodes (stations) contributes one record per time step, so every timestamp appears in 8 records.
  • Data was downsampled from 1-hour intervals to 6-hour intervals to reduce noise and improve model efficiency.
  • The final dataset was transformed into a 3D array of shape (time, station, feature).
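
The downsampling and reshaping steps above can be sketched as follows. This is a minimal illustration on toy data, not the project's pipeline: the station codes are placeholders, and averaging within each 6-hour window is an assumption, since the aggregation used is not stated.

```python
import numpy as np
import pandas as pd

# Hypothetical long-format frame: one row per (timestamp, station).
times = pd.date_range("2018-01-01", periods=24, freq="h")
stations = ["AAA", "BBB"]  # placeholder station codes
features = ["tmpf", "relh"]
long_df = pd.DataFrame(
    [(t, s, 30.0 + i, 50.0) for i, t in enumerate(times) for s in stations],
    columns=["valid", "station"] + features,
)

# Pivot so each (feature, station) pair becomes a column indexed by timestamp.
wide = long_df.pivot(index="valid", columns="station", values=features)

# Downsample from 1-hour to 6-hour intervals.
# Assumption: each 6-hour window is summarized by its mean.
six_hourly = wide.resample("6h").mean()

# Stack the per-feature (time, station) tables into a (time, station, feature) array.
cube = np.stack([six_hourly[f][stations].to_numpy() for f in features], axis=-1)
print(cube.shape)  # (4, 2, 2)
```

Pivoting before resampling keeps every station on the same time grid, which is what a dense 3D array requires.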

Correlation Analysis

  • Correlation analysis was used to minimize data leakage, since many weather features are calculated directly from temperature.
  • Inter-node correlation: each node was split into its own dataset, and all features within that node were compared against each other.
  • Intra-node correlation: each feature was split into its own dataset, and all nodes were compared against each other for that feature.
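
Both comparisons can be sketched on a (time, station, feature) array. The sketch below assumes Pearson correlation and uses random toy data with one deliberately derived feature; the function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
T, S, F = 200, 3, 4  # hypothetical sizes: time steps, stations, features
cube = rng.normal(size=(T, S, F))
# Mimic a derived feature: feature 1 is a linear function of feature 0.
cube[:, :, 1] = 2.0 * cube[:, :, 0] + 1.0

def feature_corr_per_station(x):
    """One (F, F) matrix per station: features compared within each station."""
    return np.stack(
        [np.corrcoef(x[:, s, :], rowvar=False) for s in range(x.shape[1])]
    )

def station_corr_per_feature(x):
    """One (S, S) matrix per feature: stations compared for each feature."""
    return np.stack(
        [np.corrcoef(x[:, :, f], rowvar=False) for f in range(x.shape[2])]
    )

feat_corr = feature_corr_per_station(cube)  # shape (S, F, F)
stat_corr = station_corr_per_feature(cube)  # shape (F, S, S)
```

A correlation near 1 between two features at a station, as with the derived feature here, flags a candidate for removal before modeling.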

Feature Imputation

  • Interpolation was essential for this task because the data comes from real-world weather sensors and has frequent quality issues.
  • For the spatial imputation step, each time step was analysed, and any missing feature values were calculated from the neighboring nodes and the graph edge weights.
  • When a feature is missing at all nodes, spatial imputation is impossible, so a more naive temporal imputation was required.

Graph Neural Network

  • The GNN is structured to model the spatiotemporal dynamics of the weather station network. It is implemented in PyTorch, with an architecture inspired by the Diffusion Convolutional Recurrent Neural Network (DCRNN) (Li et al. 2018).
  • The model was tested against the last 6 months of the selected dataset.
  • The GNN uses the preceding 28 time steps to forecast the next temperature value.
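
The 28-step history and the held-out final period can be sketched as a sliding-window construction followed by a chronological split. The function name, the toy array sizes, and the test-set length below are hypothetical.

```python
import numpy as np

def make_windows(cube, history=28, target_feature=0):
    """Build (input window, next-step target) pairs.

    cube: array of shape (time, station, feature).
    Returns X of shape (n, history, station, feature) and
    y of shape (n, station), where y[i] follows window X[i].
    """
    n = cube.shape[0] - history
    X = np.stack([cube[i:i + history] for i in range(n)])
    y = cube[history:, :, target_feature]
    return X, y

# Toy data: 40 time steps, 2 stations, 3 features.
cube = np.arange(40 * 2 * 3, dtype=float).reshape(40, 2, 3)
X, y = make_windows(cube, history=28)

# Chronological split: the most recent samples are held out for testing.
n_test = 4  # hypothetical; the project held out the last 6 months
X_train, X_test = X[:-n_test], X[-n_test:]
y_train, y_test = y[:-n_test], y[-n_test:]
```

Splitting by time rather than at random keeps future observations out of the training set.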

Linear Model

  • The linear baseline model was designed as a time-series regression task with a single target variable (temperature).
  • At each time step t, the input vector aggregates the five base features across all stations:

\[ \text{tmpf}_{t+1} = \text{features}_t + \text{features}_{t-1} + \dots + \text{features}_{t-27} \]

Where:

\[ \text{features}_t = \{\text{tmpf},\ \text{relh},\ \text{sknt},\ \text{drct}_{\text{sin}},\ \text{drct}_{\text{cos}}\} \]

tmpf: Temperature

relh: Relative Humidity

sknt: Wind Speed

\(\text{drct}_{\text{sin}},\ \text{drct}_{\text{cos}}\): Wind Direction encoded as sine and cosine components
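
One way to read the formula above is as ordinary least squares on lagged features. The sketch below assumes the lagged features of all stations are concatenated into one vector and that an intercept is included; neither detail is confirmed by the source, and the function names are hypothetical.

```python
import numpy as np

def fit_lagged_linear(cube, history=28, target_feature=0):
    """Least-squares fit of next-step targets on flattened lagged features.

    cube: array of shape (time, station, feature).
    Returns coefficients of shape (1 + history*station*feature, station);
    row 0 is the intercept.
    """
    n = cube.shape[0] - history
    X = np.stack([cube[i:i + history].ravel() for i in range(n)])
    X = np.hstack([np.ones((n, 1)), X])      # intercept column (assumption)
    Y = cube[history:, :, target_feature]    # next-step target per station
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return coef

def predict_next(coef, window):
    """Predict the next target from one (history, station, feature) window."""
    return np.concatenate([[1.0], window.ravel()]) @ coef

# Toy check: a sinusoid obeys x[t+1] = 2*cos(w)*x[t] - x[t-1] exactly.
series = np.sin(0.3 * np.arange(50))
cube = series[:, None, None]                 # 1 station, 1 feature
coef = fit_lagged_linear(cube, history=2)
pred = predict_next(coef, cube[-2:])
```

On this toy series the fit recovers the recurrence coefficients, which confirms that the lag ordering in the flattened vector is oldest first.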

Analysis

  • Data Source
  • Data Structure
  • Exploratory Data Analysis
  • Graph Creation
  • Spatiotemporal Imputation
  • Correlation Analysis
  • Final Preparation

Data Source

  • The dataset used in this project was sourced from the Iowa Environmental Mesonet (IEM), hosted by Iowa State University (Herzmann 2023).
  • The data follows observational standards set by the FAA Automated Surface Observing System (ASOS) (Federal Aviation Administration 2021).

Data Structure

  • The original dataset contains:
    • 33 features
    • 8 stations
    • 96,408 hourly time steps
  • With intermittent missing values across both time and stations

| Feature | Description |
|---|---|
| station | Station identifier code (3-4 characters) |
| valid | Timestamp of the observation |
| lon | Longitude |
| lat | Latitude |
| elevation | Elevation (feet) |
| tmpf | Air temperature (°F) |
| relh | Relative humidity (%) |
| drct | Wind direction (degrees) |
| sknt | Wind speed (knots) |
| p01i | Precipitation over the previous hour (inches) |
| vsby | Visibility (miles) |

Exploratory Data Analysis

  • Initial exploratory analysis focused on filtering out low-quality features and stations and on reducing the dimensionality of the dataset.
  • The visual below shows all features that meet these quality criteria for varying time slices, along with a 0/1 flag indicating whether each station is valid within the time slice.
  • Based on this visual, the date range 2018 to 2020 was selected because it had the most valid features and stations while still being recent.

Graph Creation

  • To prepare the dataset for graph-based modeling, a spatial graph was constructed.
  • Edge weights were defined as the inverse of the geodesic distance, scaled to a [0, 1] range using a min-max scaler, so the closer two stations are, the stronger their connection in the graph.
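
A minimal sketch of the edge-weight construction follows. It substitutes the haversine great-circle distance for a true geodesic distance to stay dependency-free, and the coordinates are made up for illustration; function names are hypothetical.

```python
import numpy as np

def haversine_km(lat_deg, lon_deg):
    """Pairwise great-circle distances in km (stand-in for geodesic distance)."""
    lat = np.radians(lat_deg)
    lon = np.radians(lon_deg)
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2)
    return 2 * 6371.0 * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

def inverse_distance_weights(dist):
    """Inverse distance on off-diagonal pairs, min-max scaled to [0, 1]."""
    off = ~np.eye(dist.shape[0], dtype=bool)
    inv = np.zeros_like(dist)
    inv[off] = 1.0 / dist[off]
    lo, hi = inv[off].min(), inv[off].max()
    w = np.zeros_like(dist)          # no self-loops: diagonal stays 0
    w[off] = (inv[off] - lo) / (hi - lo)
    return w

# Made-up coordinates for three stations.
lat = np.array([39.0, 38.0, 37.5])
lon = np.array([-95.0, -97.0, -100.0])
dist = haversine_km(lat, lon)
w = inverse_distance_weights(dist)
```

Note that min-max scaling maps the most distant pair to a weight of exactly 0, which effectively removes that edge. The sketch also assumes at least two distinct pairwise distances and no co-located stations.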

Spatiotemporal Imputation

  • Missing values were imputed through a two-stage process leveraging both spatial and temporal structure.
  • Spatial Imputation: Each missing value was estimated based on the value of neighboring nodes within the same time step, weighted by graph connectivity.
  • Temporal Imputation: Remaining gaps were filled by interpolating along the time axis for each node individually.
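
The two stages can be sketched for a single feature as below. The weighted-average form of the spatial step and the use of linear interpolation for the temporal step are assumptions; the source only states that neighbors are weighted by graph connectivity and that gaps are interpolated along time.

```python
import numpy as np
import pandas as pd

def spatial_impute(x, w):
    """Fill NaNs with the edge-weighted mean of observed neighbors.

    x: (time, station) values for one feature, NaN where missing.
    w: (station, station) non-negative edge weights.
    """
    x = x.copy()
    w = w.copy()
    np.fill_diagonal(w, 0.0)                  # a station is not its own neighbor
    observed = ~np.isnan(x)
    num = np.where(observed, x, 0.0) @ w.T    # weighted sum of observed neighbors
    den = observed.astype(float) @ w.T        # total weight of observed neighbors
    with np.errstate(invalid="ignore", divide="ignore"):
        est = num / den
    fill = ~observed & (den > 0)              # skip rows with no observed neighbor
    x[fill] = est[fill]
    return x

def temporal_impute(x):
    """Fill remaining NaNs by linear interpolation along time, per station."""
    frame = pd.DataFrame(x)
    return frame.interpolate(method="linear", axis=0,
                             limit_direction="both").to_numpy()

x = np.array([[1.0, 2.0, np.nan],
              [np.nan, np.nan, np.nan],      # every station missing
              [3.0, 4.0, 5.0]])
w = np.ones((3, 3))                           # toy graph with equal weights
spatial = spatial_impute(x, w)
filled = temporal_impute(spatial)
```

The middle row shows why the second stage is needed: with no observed neighbor at that time step, only the temporal pass can fill it.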

Below is an example of the data requiring both spatial and temporal imputation:

Below is the same data post imputation:

Correlation Analysis

  • To avoid feature redundancy and data leakage, a correlation analysis was conducted.
  • Inter-node correlation was done to find any relationships between features within a node.
  • Intra-node correlation was done to explore the importance of spatial information across features.

Below shows the inter-node correlation:

Below shows the intra-node correlation:

Final Preparation

  • After all preprocessing steps, the final dataset was reduced and standardized.
    • 5 features were retained: tmpf, relh, sknt, drct_sin, drct_cos.
    • 7 stations remained after filtering.
    • 4,381 time steps at 6-hour intervals (equivalent to roughly 3 years of data) remained.
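
The circular encoding behind drct_sin and drct_cos, together with one common form of standardization, can be sketched as follows. Z-scoring with training-set statistics is an assumption; the source says only that the data was standardized. Function names are hypothetical.

```python
import numpy as np

def encode_direction(drct_deg):
    """Map wind direction in degrees to (sin, cos) so 359 and 1 are close."""
    rad = np.deg2rad(drct_deg)
    return np.sin(rad), np.cos(rad)

def standardize(train, full):
    """Z-score `full` using the mean and std of `train` (per column)."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    return (full - mean) / std

drct = np.array([350.0, 10.0, 0.0, 20.0])
drct_sin, drct_cos = encode_direction(drct)

# Distance in the encoded space depends only on the angular gap.
gap_wrap = np.hypot(drct_sin[0] - drct_sin[1], drct_cos[0] - drct_cos[1])
gap_plain = np.hypot(drct_sin[2] - drct_sin[3], drct_cos[2] - drct_cos[3])

temps = np.array([[30.0], [40.0], [50.0], [60.0]])
scaled = standardize(temps[:3], temps)  # statistics from the first 3 rows only
```

Here 350 and 10 degrees end up exactly as far apart as 0 and 20 degrees, which a raw degree value would not capture.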

Modeling and Results

  • Model 1: GNN
  • Model 2: LR
  • Error Comparison
  • Key Findings

Model 1: GNN

  • The Graph Neural Network was trained using the previous 28 time steps (equivalent to 7 days) and leveraged a dense spatial graph connecting all stations.
  • This structure enabled the model to learn both temporal sequences and spatial diffusion patterns across the weather station network.
  • The model had a final MSE of 0.0562.
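
To illustrate what spatial diffusion means here, the sketch below implements a single diffusion convolution step in the spirit of Li et al. (2018), using NumPy rather than PyTorch. It is not the project's model: it omits the recurrent cell, uses one diffusion direction (sufficient for a symmetric graph), and all names are hypothetical.

```python
import numpy as np

def diffusion_conv(x, w, thetas):
    """Sum over k of (P^k x) @ theta_k, with P the random-walk matrix of w.

    x: (station, in_features) node signals at one time step.
    w: (station, station) non-negative edge weights with nonzero row sums.
    thetas: list of (in_features, out_features) matrices for k = 0..K.
    """
    p = w / w.sum(axis=1, keepdims=True)   # row-normalized transitions D^-1 W
    out = np.zeros((x.shape[0], thetas[0].shape[1]))
    h = x
    for theta in thetas:
        out += h @ theta                   # contribution of the k-step diffusion
        h = p @ h                          # diffuse one more step
    return out

# Dense toy graph on 3 stations with equal weights and no self-loops.
w = np.ones((3, 3)) - np.eye(3)
x = np.array([[1.0], [2.0], [3.0]])
thetas = [np.eye(1), np.eye(1)]            # K = 1 with identity filters
out = diffusion_conv(x, w, thetas)
```

With identity filters, each output is the station's own value plus the average of its neighbors, which is the mixing a trained layer would reweight.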

GNN Results

Model 2: LR

  • The linear baseline model was trained using the same 28-time-step history with aggregated weather station data to predict the next time step’s temperature.
  • All features were flattened into a single vector, treating the problem as a high-dimensional regression task with no spatial awareness.
  • The model had a final MSE of 0.0147.

LR Results

Error Comparison

  • When the models are compared station by station, it becomes apparent how poorly the GNN performs on this predictive task.
  • Overall, the LR model shows a lower average error as well as more consistent results.
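
A per-station comparison of this kind can be computed as below. The arrays are toy values for illustration only and are not the project's results; the function name is hypothetical.

```python
import numpy as np

def per_station_mse(y_true, y_pred):
    """Mean squared error per station for (time, station) arrays."""
    return ((y_true - y_pred) ** 2).mean(axis=0)

# Toy targets and predictions for 2 time steps and 2 stations.
y_true = np.zeros((2, 2))
pred_a = np.array([[1.0, 0.0], [1.0, 2.0]])   # hypothetical model A
pred_b = np.array([[0.5, 0.5], [0.5, 0.5]])   # hypothetical model B

mse_a = per_station_mse(y_true, pred_a)
mse_b = per_station_mse(y_true, pred_b)

# Spread across stations is one simple measure of consistency.
spread_a, spread_b = mse_a.std(), mse_b.std()
```

Comparing both the mean and the spread of the per-station errors separates a model that is better on average from one that is also more consistent.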

Key Findings

  • Small Graph Structures Do Not Matter.
  • Temporal Context is Crucial.
  • Feature Engineering Adds Value.
  • Graph Structure is Important.
  • Static Graphs are Restrictive.

Conclusion

  • Key Results
  • Future Work

Key Results

  • A Graph Neural Network trained on spatiotemporal weather data may not outperform a standard linear regression baseline for short-term temperature prediction.
  • Incorporating spatial structure through graph edges enabled the model to learn regional weather interactions that linear models could not.
  • Careful data preprocessing, including imputation, scaling, and circular feature handling, was essential to achieving strong performance from both models.
  • GNNs are highly sensitive to hyperparameter tuning and may outperform the linear baseline if given a much larger model structure or more careful tuning.

Future Work

  • These findings demonstrate the potential of traditional models compared to graph-based deep learning approaches.
  • It is apparent that aggregating stations, as is done with the linear model, most likely only worked because of the close proximity of the stations.
  • We still believe there is potential in applying a graph-based deep learning approach to large-scale weather forecasting.
  • If the dataset were instead built from all stations across the US, it most likely would not be possible to aggregate in a way that still preserves spatial information.

References

Federal Aviation Administration. 2021. “Automated Surface Observing System.” https://www.weather.gov/media/asos/aum-toc.pdf.
Herzmann, Daryl. 2023. “IEM:: Download ASOS/AWOS/Metar Data.” Iowa Environmental Mesonet. https://www.mesonet.agron.iastate.edu/request/download.phtml?network=KS_ASOS.
Li, Yaguang, Rose Yu, Cyrus Shahabi, and Yan Liu. 2018. “Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting.” https://arxiv.org/abs/1707.01926.